Smart Paper Notes

Scalable Primitives for Generalized Sensor Fusion in Autonomous Vehicles

Sammy Sidhu , Linda Wang , Tayyab Naseer , Ashish Malhotra , Jay Chia , Aayush Ahuja , Ella Rasmussen , Qiangui Huang , Ray Gao

Categories: Computer Vision | Robotics

2021-12-01

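In autonomous driving, there has been an explosion in the use of deep neural networks for perception, prediction, and planning tasks. As autonomous vehicles (AVs) move closer to production, multi-modal sensor inputs and heterogeneous vehicle fleets with different sensor platforms are becoming increasingly common in the industry. However, neural network architectures typically target a specific sensor platform and are not robust to changes in input, which makes scaling and model deployment particularly difficult. Furthermore, most industry players still treat software and hardware as entirely independent problems. We propose a new end-to-end architecture, Generalized Sensor Fusion (GSF), designed so that both the sensor inputs and the target tasks are modular and modifiable. This lets AV system designers easily experiment with different sensor configurations and methods, and opens up the ability to deploy on heterogeneous fleets using the same models shared across a large engineering organization. Using this system, we report experimental results demonstrating near-parity between an expensive high-density (HD) LiDAR sensor and a cheap low-density (LD) LiDAR plus camera setup on the 3D object detection task. This paves the way for the industry to jointly design hardware and software architectures, as well as large fleets with heterogeneous configurations.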

A fine-grained comparison of pragmatic language understanding in humans and language models

Jennifer Hu , Sammy Floyd , Olessia Jouravlev , Evelina Fedorenko , Edward Gibson

Categories: Natural Language Processing | Artificial Intelligence

2022-12-13

Pragmatics is an essential part of communication, but it remains unclear what mechanisms underlie human pragmatic communication and whether NLP systems capture pragmatic language understanding. To investigate both these questions, we perform a fine-grained comparison of language models and humans on seven pragmatic phenomena, using zero-shot prompting on an expert-curated set of English materials. We ask whether models (1) select pragmatic interpretations of speaker utterances, (2) make similar error patterns as humans, and (3) use similar linguistic cues as humans to solve the tasks. We find that the largest models achieve high accuracy and match human error patterns: within incorrect responses, models favor the literal interpretation of an utterance over heuristic-based distractors. We also find evidence that models and humans are sensitive to similar linguistic cues. Our results suggest that even paradigmatic pragmatic phenomena may be solved without explicit representations of other agents' mental states, and that artificial models can be used to gain mechanistic insights into human pragmatic processing.

TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning

Andrea Ziani , Zicong Fan , Muhammed Kocabas , Sammy Christen , Otmar Hilliges

Categories: Computer Vision

2022-09-01

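We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction. Unlike previous time-contrastive methods for hand pose estimation, our framework considers temporal consistency in its augmentation scheme and accounts for the differences in hand poses along the temporal direction. Our data-driven method leverages unlabelled videos and a standard CNN, without relying on synthetic data, pseudo-labels, or specialized architectures. Our approach improves the performance of fully supervised hand reconstruction methods by 15.9% and 7.6% on the HO-3D and FreiHAND datasets respectively, thus establishing new state-of-the-art performance. Finally, we demonstrate, both quantitatively and qualitatively, that our approach produces smoother hand reconstructions through time and is more robust to heavy occlusions than the previous state of the art. Our code and models will be available at https://eth-ait.github.io/tempclr.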
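
As a rough illustration of the contrastive idea, the sketch below computes a generic InfoNCE-style loss in which the positive is an embedding of a temporally nearby frame and the negatives are embeddings of other frames. This is an assumption for illustration only: the abstract does not state the loss, and the function name `info_nce`, the cosine similarity, and the temperature value are all hypothetical choices, not the authors' implementation.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss on plain Python vectors (illustrative only).

    anchor, positive: embeddings of two temporally close frames (assumed).
    negatives: embeddings of unrelated frames (assumed).
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss is small when the anchor is closer to the positive
    # than to any of the negatives.
    return -math.log(pos / (pos + neg))
```

The loss shrinks as the anchor embedding moves toward the nearby-frame embedding and away from the others, which is the general mechanism contrastive pretraining relies on.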

SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning

Marco Bagatella , Sammy Christen , Otmar Hilliges

Categories: Machine Learning

2022-05-26

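Efficient exploration is a key challenge in deep reinforcement learning. Several methods, such as behavioral priors, are able to leverage offline data to efficiently accelerate reinforcement learning on complex tasks. However, the effectiveness of such methods is limited if the task at hand deviates too far from the demonstrated task. In our work, we propose to learn features from offline data that are shared by a more diverse range of tasks, such as the correlation between actions and directedness. We therefore introduce state-free priors, which directly model temporal consistency in demonstrated trajectories and are able to drive exploration in complex tasks even when trained on data collected on simpler tasks. Furthermore, we introduce a novel integration scheme for action priors in off-policy reinforcement learning, which dynamically samples actions from a probabilistic mixture of the policy and the action prior. We compare our approach against strong baselines and provide empirical evidence that it can accelerate reinforcement learning in long-horizon continuous control tasks under sparse reward settings.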
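
Sampling from a mixture of the policy and an action prior can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: `policy_sample`, `prior_sample`, and the mixing weight `lam` are hypothetical names, and the weight is passed in as a plain number because the abstract does not say how it is adapted over time.

```python
import random


def sample_mixture_action(policy_sample, prior_sample, lam, rng=random):
    """Draw one action from a two-component mixture (illustrative only).

    policy_sample: zero-argument callable returning an action from the policy.
    prior_sample:  zero-argument callable returning an action from the prior.
    lam:           probability of choosing the prior, in [0, 1] (assumed).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so the prior is chosen
    # with probability exactly lam.
    if rng.random() < lam:
        return prior_sample()
    return policy_sample()
```

With `lam` near 1 the agent mostly follows the prior, which may help early exploration; with `lam` near 0 it mostly follows the learned policy.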

AI-enabled Assessment of Cardiac Systolic and Diastolic Function from Echocardiography

Esther Puyol-Antón , Bram Ruijsink , Baldeep S. Sidhu , Justin Gould , Bradley Porter , Mark K. Elliott , Vishal Mehta , Haotian Gu , Miguel Xochicale , Alberto Gomez

Categories: Computer Vision

2022-03-21

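Left ventricular (LV) function is an important factor in the patient management, outcome, and long-term survival of patients with heart disease. The most recently published clinical guidelines for heart failure recognise that relying on only one measure of cardiac function (LV ejection fraction) as a diagnostic and treatment stratification biomarker is suboptimal. Recent advances in AI-based echocardiography analysis have shown good results on the automated estimation of LV volumes and LV ejection fraction. However, from time-varying 2D echocardiography acquisitions, a richer description of cardiac function can be obtained by estimating functional biomarkers from the complete cardiac cycle. In this work, we propose for the first time an AI approach for deriving advanced biomarkers from 2D echocardiography, based on segmentation of the full cardiac cycle. These biomarkers will allow clinicians to obtain a richer picture of the heart in health and disease. The AI model is based on the "nnU-Net" framework and was trained and tested using four different databases. The results show high agreement between manual and automated analysis and showcase the potential of the advanced systolic and diastolic biomarkers for patient stratification. Finally, for a subset of 50 cases, we perform a correlation analysis between clinical biomarkers derived from echocardiography and from CMR, and we show excellent agreement between the two modalities.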
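
For reference, the LV ejection fraction mentioned above is conventionally computed from the end-diastolic and end-systolic volumes as EF = (EDV - ESV) / EDV. The sketch below applies that standard formula to a per-frame volume curve over one cardiac cycle. It is a minimal illustration, not the paper's pipeline: the function name and the assumption that the largest and smallest volumes in the list correspond to end-diastole and end-systole are ours.

```python
def ejection_fraction(volumes_ml):
    """Return the LV ejection fraction in percent from a volume curve.

    volumes_ml: LV volume (mL) for each frame of one cardiac cycle.
    Assumes the maximum is the end-diastolic volume (EDV) and the
    minimum is the end-systolic volume (ESV).
    """
    if not volumes_ml:
        raise ValueError("volume curve is empty")
    edv = max(volumes_ml)
    esv = min(volumes_ml)
    if edv <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv - esv) / edv
```

A per-frame volume curve carries more information than this single number (for example filling and emptying rates), which is the motivation for the full-cycle biomarkers the abstract describes.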

Neural networks with linear threshold activations: structure and algorithms

Sammy Khalife , Amitabh Basu

Categories: Machine Learning

2021-11-15

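In this article, we present new results on neural networks with linear threshold activation functions. We precisely characterize the class of functions that such neural networks can represent, and show that 2 hidden layers are necessary and sufficient to represent any function in that class. This is a surprising result in light of recent exact representability investigations for neural networks using other popular activation functions, such as rectified linear units (ReLU). We also give precise bounds on the sizes of the neural networks required to represent any function in the class. Finally, we design an algorithm that solves the empirical risk minimization (ERM) problem to global optimality for these neural networks with a fixed architecture. The algorithm's running time is polynomial in the size of the data sample, if the input dimension and the size of the network architecture are considered fixed constants. The algorithm is unique in the sense that it works for any architecture with any number of layers, whereas previous polynomial-time globally optimal algorithms work only for very restricted classes of architectures.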
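
To make the setting concrete, the sketch below builds a tiny network with two hidden layers of linear threshold units and a linear output, computing the indicator function of the open interval (0, 1). It is a toy illustration, not a construction from the paper: the activation convention (output 1 when the pre-activation is strictly positive, else 0), the helper names, and the chosen weights are all assumptions.

```python
def threshold(z):
    """Linear threshold activation: 1 if z > 0, else 0 (assumed convention)."""
    return 1.0 if z > 0 else 0.0


def dense(weights, biases, inputs, activation=None):
    """Apply one fully connected layer to a list of inputs."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def interval_indicator(x):
    """Return 1.0 if 0 < x < 1, else 0.0, using threshold units only."""
    # Hidden layer 1: one unit fires when x > 0, the other when 1 - x > 0.
    h1 = dense([[1.0], [-1.0]], [0.0, 1.0], [x], threshold)
    # Hidden layer 2: fires only when both first-layer units fire (logical AND).
    h2 = dense([[1.0, 1.0]], [-1.5], h1, threshold)
    # Linear output layer.
    return dense([[1.0]], [0.0], h2)[0]
```

Because every hidden unit outputs 0 or 1, the network's output is piecewise constant in `x`.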

nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

Holger Caesar , Juraj Kabzan , Kok Seang Tan , Whye Kit Fong , Eric Wolff , Alex Lang , Luke Fletcher , Oscar Beijbom , Sammy Omari

Categories: Computer Vision

2021-06-22

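In this work, we propose the world's first closed-loop ML-based planning benchmark for autonomous driving. While there is a growing body of ML-based motion planners, the lack of established datasets and metrics has limited progress in this area. Existing benchmarks for autonomous vehicle motion prediction focus on short-term motion forecasting rather than long-term planning. This has led previous works to use open-loop evaluation with L2-based metrics, which are not suitable for fairly evaluating long-term planning. Our benchmark overcomes these limitations by introducing a large-scale driving dataset, a lightweight closed-loop simulator, and motion-planning-specific metrics. We provide a high-quality dataset with 1500 h of human driving data from 4 cities across the US and Asia with widely varying traffic patterns (Boston, Pittsburgh, Las Vegas, and Singapore). We will provide a closed-loop simulation framework with reactive agents, along with a set of both general and scenario-specific planning metrics. We plan to release the dataset at NeurIPS 2021 and to organize benchmark challenges starting in early 2022.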
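
An L2-based open-loop metric of the kind criticized above can be sketched as the average Euclidean distance between a planned trajectory and the logged human trajectory at matching timesteps. This is a generic illustration, not a metric taken from the benchmark: the function name and the representation of trajectories as lists of (x, y) points are assumptions.

```python
import math


def average_l2_error(planned, logged):
    """Mean Euclidean distance between two equally sampled 2D trajectories.

    planned, logged: lists of (x, y) positions at the same timesteps (assumed).
    """
    if not planned or len(planned) != len(logged):
        raise ValueError("trajectories must be non-empty and equally long")
    total = sum(math.dist(p, q) for p, q in zip(planned, logged))
    return total / len(planned)
```

Such a score compares against a single recorded drive and never lets other agents react to the planner's choices, which is one reason a closed-loop simulator may give a fairer picture of long-term planning.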

Describing Textures in the Wild

Mircea Cimpoi , Subhransu Maji , Iasonas Kokkinos , Sammy Mohamed , Andrea Vedaldi

Categories:

2013-11-14

Patterns and textures are defining characteristics of many natural objects: a shirt can be striped, the wings of a butterfly can be veined, and the skin of an animal can be scaly. Aiming at supporting this analytical dimension in image understanding, we address the challenging problem of describing textures with semantic attributes. We identify a rich vocabulary of forty-seven texture terms and use them to describe a large dataset of patterns collected "in the wild". The resulting Describable Textures Dataset (DTD) is the basis to seek for the best texture representation for recognizing describable texture attributes in images. We port from object recognition to texture recognition the Improved Fisher Vector (IFV) and show that, surprisingly, it outperforms specialized texture descriptors not only on our problem, but also in established material recognition datasets. We also show that the describable attributes are excellent texture descriptors, transferring between datasets and tasks; in particular, combined with IFV, they significantly outperform the state-of-the-art by more than 8% on both FMD and KTH-TIPS-2b benchmarks. We also demonstrate that they produce intuitive descriptions of materials and Internet images.
